Heuristic Principal Component Analysis-Based Unsupervised Feature Extraction and Its Application to Bioinformatics
نویسندگان
چکیده
Feature Extraction (FE) is a difficult task when the number of features is much larger than the number of samples, although that is a typical situation when biological (big) data is analyzed. This is especially true when FE is stable, independent of the samples considered (stable FE), and is often required. However, the stability of FE has not been considered seriously. In this chapter, the authors demonstrate that Principal Component Analysis (PCA)-based unsupervised FE functions as stable FE. Three bioinformatics applications of PCA-based unsupervised FE–detection of aberrant DNA methylation associated with diseases, biomarker identification using circulating microRNA, and proteomic analysis of bacterial culturing processes–are discussed.
منابع مشابه
Feature reduction of hyperspectral images: Discriminant analysis and the first principal component
When the number of training samples is limited, feature reduction plays an important role in classification of hyperspectral images. In this paper, we propose a supervised feature extraction method based on discriminant analysis (DA) which uses the first principal component (PC1) to weight the scatter matrices. The proposed method, called DA-PC1, copes with the small sample size problem and has...
متن کاملDevelopment of a cell formation heuristic by considering realistic data using principal component analysis and Taguchi’s method
Over the last four decades of research, numerous cell formation algorithms have been developed and tested, still this research remains of interest to this day. Appropriate manufacturing cells formation is the first step in designing a cellular manufacturing system. In cellular manufacturing, consideration to manufacturing flexibility and productionrelated data is vital for cell formation....
متن کاملComparative Analysis of Wavelet-based Feature Extraction for Intramuscular EMG Signal Decomposition
Background: Electromyographic (EMG) signal decomposition is the process by which an EMG signal is decomposed into its constituent motor unit potential trains (MUPTs). A major step in EMG decomposition is feature extraction in which each detected motor unit potential (MUP) is represented by a feature vector. As with any other pattern recognition system, feature extraction has a significant impac...
متن کاملImproved Algorithm for Fully-automated Neural Spike Sorting based on Projection Pursuit and Gaussian Mixture Model
For the analysis of multiunit extracellular neural signals as multiple spike trains, neural spike sorting is essential. Existing algorithms for the spike sorting have been unsatisfactory when the signal-to-noise ratio (SNR) is low, especially for implementation of fully-automated systems. We present a novel method that shows satisfactory performance even under low SNR, and compare its performan...
متن کاملAn application of principal component analysis and logistic regression to facilitate production scheduling decision support system: an automotive industry case
Production planning and control (PPC) systems have to deal with rising complexity and dynamics. The complexity of planning tasks is due to some existing multiple variables and dynamic factors derived from uncertainties surrounding the PPC. Although literatures on exact scheduling algorithms, simulation approaches, and heuristic methods are extensive in production planning, they seem to be ineff...
متن کامل